✂️ Semantic Segmentation
Introduction
Instead of predicting one label (cat, dog, etc.) per image, we will predict one label per pixel!
Each pixel should belong to a class (cat, dog, etc.) or to a background class.

Applications
- Autonomous driving
- Medicine
Representing the task

Similar to how we treat standard categorical values, we’ll create our target by one-hot encoding the class labels - essentially creating an output channel for each of the possible classes.
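A minimal NumPy sketch of that target encoding, assuming the labels arrive as an `(H, W)` array of integer class ids. The helper name `one_hot_mask` and the toy 2×3 mask are made up for illustration.

```python
import numpy as np

def one_hot_mask(label_mask: np.ndarray, num_classes: int) -> np.ndarray:
    """Turn an (H, W) integer mask into a (num_classes, H, W) one-hot target."""
    # Indexing the identity matrix with the mask gives shape (H, W, num_classes).
    one_hot = np.eye(num_classes, dtype=np.uint8)[label_mask]
    # Move the class axis to the front: one output channel per class.
    return np.moveaxis(one_hot, -1, 0)

# Hypothetical mask: 0 = background, 1 = cat, 2 = dog
mask = np.array([[0, 1, 1],
                 [2, 0, 1]])
target = one_hot_mask(mask, num_classes=3)
print(target.shape)  # (3, 2, 3)
print(target[1])     # channel that is 1 wherever the pixel is a cat
```

Each pixel has exactly one active channel, so `target.argmax(axis=0)` recovers the original mask.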

Models
Note that the model backbone can be a ResNet, DenseNet, Inception…
Naive model: Convolutions + Transpose Convolutions (stride=2)
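One way the naive model could look in PyTorch, as a sketch rather than a reference implementation. The class name `NaiveSegNet`, the channel widths, and the use of stride-2 convolutions for downsampling are all assumptions; the notes only specify convolutions followed by stride-2 transpose convolutions.

```python
import torch
from torch import nn

class NaiveSegNet(nn.Module):
    """Downsample with convolutions, upsample back with transpose convolutions."""

    def __init__(self, in_channels: int = 3, num_classes: int = 3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, stride=2, padding=1),  # H/2
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),           # H/4
            nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2),             # H/2
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(16, num_classes, kernel_size=2, stride=2),    # H
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output: one logit channel per class, same spatial size as the input.
        return self.decoder(self.encoder(x))

model = NaiveSegNet()
logits = model(torch.randn(1, 3, 64, 64))
print(logits.shape)  # torch.Size([1, 3, 64, 64])
```

Because all fine spatial detail has to pass through the low-resolution bottleneck, masks from this kind of model tend to have blurry boundaries.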

Better model: Convs + TransposeConvs (stride=2) + skip connections (encoder features concatenated into the decoder) = U-Net
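A deliberately small PyTorch sketch of the idea, assuming the encoder-to-decoder connections are implemented as channel concatenation (the original U-Net design). `TinyUNet`, `conv_block`, the two-level depth, and the channel widths are illustrative choices, not taken from any particular library.

```python
import torch
from torch import nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """Two-level U-Net: encoder features are concatenated into the decoder."""

    def __init__(self, in_channels: int = 3, num_classes: int = 3):
        super().__init__()
        self.enc1 = conv_block(in_channels, 16)
        self.enc2 = conv_block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(32, 64)
        self.up2 = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec2 = conv_block(64, 32)  # 32 upsampled + 32 from enc2
        self.up1 = nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2)
        self.dec1 = conv_block(32, 16)  # 16 upsampled + 16 from enc1
        self.head = nn.Conv2d(16, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)                     # (N, 16, H, W)
        e2 = self.enc2(self.pool(e1))         # (N, 32, H/2, W/2)
        b = self.bottleneck(self.pool(e2))    # (N, 64, H/4, W/4)
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                  # (N, num_classes, H, W)

unet = TinyUNet()
out = unet(torch.randn(1, 3, 64, 64))
print(out.shape)  # torch.Size([1, 3, 64, 64])
```

This sketch assumes the input height and width are divisible by 4, so the upsampled maps line up with the encoder maps they are concatenated with.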

History of the state of the art
| Name | Description | Date | Instance segmentation |
|---|---|---|---|
| FCN | Fully Convolutional Network | 2014 | |
| SegNet | Encoder-decoder | 2015 | |
| Unet | Concatenate like a densenet | 2015 | |
| DeepLab | Atrous Convolution and CRF | 2016 | |
| ENet | Real-time video segmentation | 2016 | |
| PSPNet | Pyramid Scene Parsing Net | 2016 | |
| FPN | Feature Pyramid Networks | 2016 | Yes |
| DeepLabv3 | Increasing dilation & field-of-view | 2017 | |
| LinkNet | Adds like a resnet | 2017 | |
| DeepLabv3+ | | 2018 | |
| PANet | Path Aggregation Network | 2018 | Yes |
| Panoptic FPN | Panoptic Feature Pyramid Networks | 2019 | ? |
| PointRend | Image Segmentation as Rendering | 2019 | ? |
Post-processing (OPTIONAL)
- Conditional Random Fields (CRF)
- Grabcut
Metrics and losses
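- Pixel-wise cross entropy
- IoU (Jaccard index): $\text{IoU} = \frac{|\text{Pred} \cap \text{GT}|}{|\text{Pred} \cup \text{GT}|} = \frac{TP}{TP + FP + FN}$
- Dice (F1): $\text{Dice} = \frac{2\,|\text{Pred} \cap \text{GT}|}{|\text{Pred}| + |\text{GT}|} = \frac{2\,TP}{2\,TP + FP + FN}$
  - Range from 0 (worst) to 1 (best)
  - In order to formulate a loss function which can be minimized, we’ll simply use $1 - \text{Dice}$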
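As a quick check of the two overlap metrics, here is a NumPy sketch for binary masks that uses IoU = TP / (TP + FP + FN) and Dice = 2·TP / (2·TP + FP + FN). The function name `iou_and_dice` and the convention of returning 1.0 when both masks are empty are assumptions made for this example.

```python
import numpy as np

def iou_and_dice(pred: np.ndarray, gt: np.ndarray) -> tuple[float, float]:
    """Compute IoU and Dice for two binary masks of the same shape."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()    # predicted and present
    fp = np.logical_and(pred, ~gt).sum()   # predicted but absent
    fn = np.logical_and(~pred, gt).sum()   # present but missed
    if tp + fp + fn == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    iou = tp / (tp + fp + fn)
    dice = 2 * tp / (2 * tp + fp + fn)
    return float(iou), float(dice)

pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
gt = np.array([[1, 0, 0],
               [0, 1, 1]])
iou, dice = iou_and_dice(pred, gt)
print(iou, dice)  # TP=2, FP=1, FN=1, so IoU = 0.5 and Dice = 4/6
```

The two metrics are monotonically related by Dice = 2·IoU / (1 + IoU), so they always rank predictions in the same order.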
Pixel-wise cross entropy
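Pixel-wise cross entropy applies the usual classification loss at every pixel and averages the result. The NumPy sketch below assumes raw scores of shape `(C, H, W)` and an integer target of shape `(H, W)`; the function name `pixelwise_cross_entropy` is hypothetical.

```python
import numpy as np

def pixelwise_cross_entropy(logits: np.ndarray, target: np.ndarray) -> float:
    """Mean cross entropy over all pixels.

    logits: (C, H, W) raw class scores; target: (H, W) integer class ids.
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    # Pick the log-probability of the true class at each pixel.
    rows, cols = np.indices(target.shape)
    picked = log_probs[target, rows, cols]
    return float(-picked.mean())

target = np.array([[0, 1],
                   [2, 1]])
uniform_logits = np.zeros((3, 2, 2))
loss = pixelwise_cross_entropy(uniform_logits, target)
print(loss)  # equals log(3): every class is equally likely at every pixel
```

Since every pixel contributes equally, classes that cover few pixels have little influence on this loss, which is one motivation for overlap-based losses such as Dice.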
Dice loss
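A sketch of a "soft" Dice loss in NumPy, where predicted probabilities stand in for hard 0/1 predictions so that the quantity is differentiable in a real framework. The name `soft_dice_loss`, the smoothing term `eps`, and averaging Dice over classes are assumptions; other variants exist.

```python
import numpy as np

def soft_dice_loss(probs: np.ndarray, target_one_hot: np.ndarray,
                   eps: float = 1e-6) -> float:
    """1 - mean Dice over classes.

    probs, target_one_hot: (C, H, W); probs holds per-class probabilities.
    """
    spatial_axes = (1, 2)
    intersection = (probs * target_one_hot).sum(axis=spatial_axes)
    cardinality = probs.sum(axis=spatial_axes) + target_one_hot.sum(axis=spatial_axes)
    # eps avoids division by zero when a class is absent from both inputs.
    dice_per_class = (2.0 * intersection + eps) / (cardinality + eps)
    return float(1.0 - dice_per_class.mean())

# Hypothetical 2-class target where every pixel belongs to class 0.
gt_one_hot = np.stack([np.ones((2, 2)), np.zeros((2, 2))])
perfect = soft_dice_loss(gt_one_hot, gt_one_hot)
wrong = soft_dice_loss(gt_one_hot[::-1], gt_one_hot)
print(perfect, wrong)  # close to 0 for a perfect match, close to 1 for a total miss
```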

Notebook: CAMVID dataset

Reference
- Blog: An overview of semantic image segmentation
- Image Segmentation Using Deep Learning: A Survey Nov 2020
- https://www.jeremyjordan.me/semantic-segmentation
- https://www.jeremyjordan.me/evaluating-image-segmentation-models
- Check Res2Net
- Check catalyst segmentation tutorial (Ranger opt, albumentations, …)
- this repo

